Combining Structured and Unstructured Randomness in Large Scale PCA
Abstract
Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning, including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top principal components of a dataset with a large number of rows (examples) and columns (features). Our algorithm leverages both structured and unstructured random projections to retain good accuracy while being computationally efficient. We demonstrate the technique on the winning submission to the KDD Cup 2010.
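The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the structured projection is a sign-hashing (count sketch) map over features and the unstructured projection is a dense Gaussian matrix, and it plugs their composition into a standard randomized range finder. The names `structured_projection`, `randomized_pca`, `d_mid`, and `oversample` are hypothetical.

```python
import numpy as np


def structured_projection(X, d_mid, rng):
    """Structured step (assumed: feature hashing / count sketch).

    Each of the d columns is added, with a random sign, to one of d_mid buckets.
    """
    n, d = X.shape
    buckets = rng.integers(0, d_mid, size=d)   # target bucket per feature
    signs = rng.choice([-1.0, 1.0], size=d)    # random sign per feature
    Zt = np.zeros((d_mid, n))
    np.add.at(Zt, buckets, (X * signs).T)      # accumulate colliding features
    return Zt.T                                # shape (n, d_mid)


def randomized_pca(X, k, d_mid, oversample=10, seed=0):
    """Approximate the top-k principal directions of X (rows are examples)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                    # PCA operates on centered data
    Z = structured_projection(Xc, d_mid, rng)  # cheap reduction from d to d_mid
    # Unstructured step: dense Gaussian test matrix on the reduced features.
    omega = rng.standard_normal((d_mid, k + oversample))
    Y = Z @ omega                              # sketch of the column space of Xc
    Q, _ = np.linalg.qr(Y)                     # orthonormal basis, (n, k + oversample)
    B = Q.T @ Xc                               # small matrix, (k + oversample, d)
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k], s[:k]                       # directions (k, d), singular values
```

In this sketch the hashing step costs one pass over the features, so the dense Gaussian multiply only touches `d_mid` columns instead of all `d`. For the result to be meaningful, `d_mid` should be at least `k + oversample`.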
Similar resources
Distributed and Scalable PCA in the Cloud
Principal Component Analysis (PCA) is a popular technique with many applications. Recent randomized PCA algorithms scale to large datasets but face a bottleneck when the number of features is also large. We propose to mitigate this issue using a composition of structured and unstructured randomness within a randomized PCA algorithm. Initial experiments using a large graph dataset from Twitter s...
Techniques for Visualizing 3D Unstructured Meshes
We present a computational module for interactively visualizing large-scale 3D unstructured meshes. Scientists and engineers routinely solve large-scale computational boundary value problems on unstructured grids. These grids typically range from several hundred thousand elements to millions of elements. With this ability to solve such large-scale problems comes the challenge of viewing the ...
On Adding Structure to Unstructured Overlay Networks
Unstructured peer-to-peer overlay networks are very resilient to churn and topology changes, while requiring little maintenance cost. Therefore, they are an infrastructure to build highly scalable large-scale services in dynamic networks. Typically, the overlay topology is defined by a peer sampling service that aims at maintaining, in each process, a random partial view of peers in the system....
Structured Sparse Principal Component Analysis
We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by [1]. While classical sparse priors only deal with cardinality, the regularization we use encodes higher-orde...
Journal title: CoRR
Volume: abs/1310.6304
Publication date: 2013